48 research outputs found

    Unlocking the transcriptomic potential of formalin-fixed paraffin embedded clinical tissues: Comparison of gene expression profiling approaches

    Background: High-throughput transcriptomics has matured into a well-established and widely utilised research tool over the last two decades. Clinical datasets generated on a range of different platforms continue to be deposited in public repositories, providing an ever-growing, valuable resource for reanalysis. Cost and tissue availability normally preclude processing samples across multiple technologies, making it challenging to directly evaluate performance and whether data from different platforms can be reliably compared or integrated. Methods: This study describes our experiences of nine new and established mRNA profiling techniques including Lexogen QuantSeq, Qiagen QiaSeq, BioSpyder TempO-Seq, Ion AmpliSeq, Nanostring, Affymetrix Clariom S or U133A, Illumina BeadChip and RNA-seq of formalin-fixed paraffin embedded (FFPE) and fresh frozen (FF) sequential patient-matched breast tumour samples. Results: The number of genes represented and reliability varied between the platforms, but overall all methods provided data which were largely comparable. Crucially, we found that it is possible to integrate data for combined analyses across FFPE/FF and platforms using established batch correction methods as required to increase cohort sizes. However, some platforms appear to be better suited to FFPE samples, particularly archival material. Conclusions: Overall, we illustrate that technology selection is a balance between required resolution, sample quality, availability and cost

    An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

    For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types

    DNA defects, epigenetics, and gene expression in cancer-adjacent breast: A study from the cancer genome atlas

    Recurrence rates after breast-conserving therapy may depend on genomic characteristics of cancer-adjacent, benign-appearing tissue. Studies have not evaluated recurrence in association with multiple genomic characteristics of cancer-adjacent breast tissue. To estimate the prevalence of DNA defects and RNA expression subtypes in cancer-adjacent, benign-appearing breast tissue at least 2 cm from the tumor margin, cancer-adjacent, pathologically well-characterized, benign-appearing breast tissue specimens from The Cancer Genome Atlas project were analyzed for DNA sequence, copy-number variation, DNA methylation, messenger RNA (mRNA) sequence, and mRNA/microRNA expression. Additional samples were also analyzed by at least one of these genomic data types and associations between genomic characteristics of normal tissue and overall survival were assessed. Approximately 40% of cancer-adjacent, benign-appearing tissues harbored genomic defects in DNA copy number, sequence, methylation, or in RNA sequence, although these defects did not significantly predict 10-year overall survival. Two mRNA/microRNA expression phenotypes were observed, including an active mRNA subtype that was identified in 40% of samples. Controlling for tumor characteristics and the presence of genomic defects, this active subtype was associated with significantly worse 10-year survival among estrogen receptor (ER)-positive cases. This multi-platform analysis of breast cancer-adjacent samples produced genomic findings consistent with current surgical margin guidelines, and provides evidence that extratumoral RNA expression patterns in cancer-adjacent tissue predict overall survival among patients with ER-positive disease

    Driver Fusions and Their Implications in the Development and Treatment of Human Cancers.

    Gene fusions represent an important class of somatic alterations in cancer. We systematically investigated fusions in 9,624 tumors across 33 cancer types using multiple fusion calling tools. We identified a total of 25,664 fusions, with a 63% validation rate. Integration of gene expression, copy number, and fusion annotation data revealed that fusions involving oncogenes tend to exhibit increased expression, whereas fusions involving tumor suppressors have the opposite effect. For fusions involving kinases, we found 1,275 with an intact kinase domain, the proportion of which varied significantly across cancer types. Our study suggests that fusions drive the development of 16.5% of cancer cases and function as the sole driver in more than 1% of them. Finally, we identified druggable fusions involving genes such as TMPRSS2, RET, FGFR3, ALK, and ESR1 in 6.0% of cases, and we predicted immunogenic peptides, suggesting that fusions may provide leads for targeted drug and immune therapy

    Associations of obesity and circulating insulin and glucose with breast cancer risk: a Mendelian randomization analysis.

    BACKGROUND: In addition to the established association between general obesity and breast cancer risk, central obesity and circulating fasting insulin and glucose have been linked to the development of this common malignancy. Findings from previous studies, however, have been inconsistent, and the nature of the associations is unclear. METHODS: We conducted Mendelian randomization analyses to evaluate the association of breast cancer risk, using genetic instruments, with fasting insulin, fasting glucose, 2-h glucose, body mass index (BMI) and BMI-adjusted waist-hip-ratio (WHRadj BMI). We first confirmed the association of these instruments with type 2 diabetes risk in a large diabetes genome-wide association study consortium. We then investigated their associations with breast cancer risk using individual-level data obtained from 98 842 cases and 83 464 controls of European descent in the Breast Cancer Association Consortium. RESULTS: All sets of instruments were associated with risk of type 2 diabetes. Associations with breast cancer risk were found for genetically predicted fasting insulin [odds ratio (OR) = 1.71 per standard deviation (SD) increase, 95% confidence interval (CI) = 1.26-2.31, p = 5.09 × 10⁻⁴], 2-h glucose (OR = 1.80 per SD increase, 95% CI = 1.30-2.49, p = 4.02 × 10⁻⁴), BMI (OR = 0.70 per 5-unit increase, 95% CI = 0.65-0.76, p = 5.05 × 10⁻¹⁹) and WHRadj BMI (OR = 0.85, 95% CI = 0.79-0.91, p = 9.22 × 10⁻⁶). Stratified analyses showed that genetically predicted fasting insulin was more closely related to risk of estrogen-receptor [ER]-positive cancer, whereas the associations with instruments of 2-h glucose, BMI and WHRadj BMI were consistent regardless of age, menopausal status, estrogen receptor status and family history of breast cancer. CONCLUSIONS: We confirmed the previously reported inverse association of genetically predicted BMI with breast cancer risk, and showed a positive association of genetically predicted fasting insulin and 2-h glucose and an inverse association of WHRadj BMI with breast cancer risk. Our study suggests that genetically determined obesity and glucose/insulin-related traits have an important role in the aetiology of breast cancer

    Oncogenic Signaling Pathways in The Cancer Genome Atlas

    Genetic alterations in signaling pathways that control cell-cycle progression, apoptosis, and cell growth are common hallmarks of cancer, but the extent, mechanisms, and co-occurrence of alterations in these pathways differ between individual tumors and tumor types. Using mutations, copy-number changes, mRNA expression, gene fusions and DNA methylation in 9,125 tumors profiled by The Cancer Genome Atlas (TCGA), we analyzed the mechanisms and patterns of somatic alterations in ten canonical pathways: cell cycle, Hippo, Myc, Notch, Nrf2, PI-3-Kinase/Akt, RTK-RAS, TGFb signaling, p53 and beta-catenin/Wnt. We charted the detailed landscape of pathway alterations in 33 cancer types, stratified into 64 subtypes, and identified patterns of co-occurrence and mutual exclusivity. Eighty-nine percent of tumors had at least one driver alteration in these pathways, and 57% of tumors had at least one alteration potentially targetable by currently available drugs. Thirty percent of tumors had multiple targetable alterations, indicating opportunities for combination therapy

    The Cancer Genome Atlas Comprehensive Molecular Characterization of Renal Cell Carcinoma

    Renal cell carcinoma (RCC) is not a single disease, but several histologically defined cancers with different genetic drivers, clinical courses, and therapeutic responses. The current study evaluated 843 RCC from the three major histologic subtypes, including 488 clear cell RCC, 274 papillary RCC, and 81 chromophobe RCC. Comprehensive genomic and phenotypic analysis of the RCC subtypes reveals distinctive features of each subtype that provide the foundation for the development of subtype-specific therapeutic and management strategies for patients affected with these cancers. Somatic alteration of BAP1, PBRM1, and PTEN and altered metabolic pathways correlated with subtype-specific decreased survival, while CDKN2A alteration, increased DNA hypermethylation, and increases in the immune-related Th2 gene expression signature correlated with decreased survival within all major histologic subtypes. CIMP-RCC demonstrated an increased immune signature, and a uniform and distinct metabolic expression pattern identified a subset of metabolically divergent (MD) ChRCC that associated with extremely poor survival

    Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas

    Precision oncology uses genomic evidence to match patients with treatment but often fails to identify all patients who may respond. The transcriptome of these “hidden responders” may reveal responsive molecular states. We describe and evaluate a machine-learning approach to classify aberrant pathway activity in tumors, which may aid in hidden responder identification. The algorithm integrates RNA-seq, copy number, and mutations from 33 different cancer types across The Cancer Genome Atlas (TCGA) PanCanAtlas project to predict aberrant molecular states in tumors. Applied to the Ras pathway, the method detects Ras activation across cancer types and identifies phenocopying variants. The model, trained on human tumors, can predict response to MEK inhibitors in wild-type Ras cell lines. We also present data that suggest that multiple hits in the Ras pathway confer increased Ras activity. The transcriptome is underused in precision oncology and, combined with machine learning, can aid in the identification of hidden responders. Way et al. develop a machine-learning approach using PanCanAtlas data to detect Ras activation in cancer. Integrating mutation, copy number, and expression data, the authors show that their method detects Ras-activating variants in tumors and sensitivity to MEK inhibitors in cell lines

    lncRNA Epigenetic Landscape Analysis Identifies EPIC1 as an Oncogenic lncRNA that Interacts with MYC and Promotes Cell-Cycle Progression in Cancer

    We characterized the epigenetic landscape of genes encoding long noncoding RNAs (lncRNAs) across 6,475 tumors and 455 cancer cell lines. In stark contrast to the CpG island hypermethylation phenotype in cancer, we observed a recurrent hypomethylation of 1,006 lncRNA genes in cancer, including EPIC1 (epigenetically-induced lncRNA1). Overexpression of EPIC1 is associated with poor prognosis in luminal B breast cancer patients and enhances tumor growth in vitro and in vivo. Mechanistically, EPIC1 promotes cell-cycle progression by interacting with MYC through EPIC1's 129–283 nt region. EPIC1 knockdown reduces the occupancy of MYC to its target genes (e.g., CDKN1A, CCNA2, CDC20, and CDC45). MYC depletion abolishes EPIC1's regulation of MYC target and luminal breast cancer tumorigenesis in vitro and in vivo. Wang et al. characterize the epigenetic landscape of lncRNA genes across a large number of human tumors and cancer cell lines and observe recurrent hypomethylation of lncRNA genes, including EPIC1. EPIC1 RNA promotes cell-cycle progression by interacting with MYC and enhancing its binding to target genes

    Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines

    The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a comprehensive encyclopedia of somatic mutation calls for the TCGA data to enable robust cross-tumor-type analyses. Our approach accounts for variance and batch effects introduced by the rapid advancement of DNA extraction, hybridization-capture, sequencing, and analysis methods over time. We present best practices for applying an ensemble of seven mutation-calling algorithms with scoring and artifact filtering. The dataset created by this analysis includes 3.5 million somatic variants and forms the basis for PanCan Atlas papers. The results have been made available to the research community along with the methods used to generate them. This project is the result of collaboration from a number of institutes and demonstrates how team science drives extremely large genomics projects